# Hybrid Mamba-Transformer Architecture
## Nemotron-H-56B-Base-8K
License: Other · Publisher: nvidia · Downloads: 904 · Likes: 26
Tags: Large Language Model · Transformers · Multilingual

Nemotron-H-56B-Base-8K is a large language model developed by NVIDIA. It uses a hybrid Mamba-Transformer architecture and supports an 8K context length and multilingual text generation.
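A minimal sketch of loading such a base model for text completion with the Hugging Face transformers library. The repo id is assumed from the listing, and `trust_remote_code=True` is assumed to be required for the custom hybrid blocks; check the model card for the confirmed usage.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "nvidia/Nemotron-H-56B-Base-8K"  # assumed repo id, taken from the listing

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # a 56B model will not fit on one small GPU
    device_map="auto",           # shard across available devices
    trust_remote_code=True,      # assumed: hybrid blocks ship as custom code
)

prompt = "A hybrid Mamba-Transformer architecture combines"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```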
## Nemotron-H-47B-Base-8K
License: Other · Publisher: nvidia · Downloads: 1,242 · Likes: 16
Tags: Large Language Model · Transformers · Multilingual

NVIDIA Nemotron-H-47B-Base-8K is a large language model (LLM) developed by NVIDIA for text completion tasks. Its hybrid architecture consists primarily of Mamba-2 and MLP layers, with only five attention layers; a toy illustration of this kind of layer schedule is sketched below.
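A toy sketch of what a hybrid layer schedule like this can look like: most blocks are Mamba-2 (sequence mixing) or MLP (channel mixing), with attention at only a handful of positions. The depth and attention positions below are hypothetical illustrations, not the model's actual configuration.

```python
NUM_LAYERS = 98                             # hypothetical depth
ATTENTION_POSITIONS = {9, 29, 49, 69, 89}   # hypothetical: exactly five attention layers

def block_type(layer_idx: int) -> str:
    """Pick a block type for a layer index in the toy schedule."""
    if layer_idx in ATTENTION_POSITIONS:
        return "attention"
    # Remaining layers alternate Mamba-2 and MLP blocks.
    return "mamba2" if layer_idx % 2 == 0 else "mlp"

schedule = [block_type(i) for i in range(NUM_LAYERS)]
print(schedule.count("attention"), "attention layers")  # 5
print(schedule.count("mamba2"), "mamba2 layers")        # 49
print(schedule.count("mlp"), "mlp layers")              # 44
```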
## MambaVision-L3-512-21K
License: Other · Publisher: nvidia · Downloads: 7,548 · Likes: 49
Tags: Image Classification · Transformers

MambaVision is the first hybrid computer-vision model to combine the strengths of Mamba and Transformer. It enhances visual feature modeling by redesigning the Mamba formulation, and it incorporates self-attention modules in the final layers of the Mamba architecture to improve long-range spatial dependency modeling.
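A minimal sketch of running such a checkpoint for image classification. The repo id, the 512x512 input size (suggested by the variant name), and the standard transformers classification interface are all assumptions; the checkpoint's custom code may define its own preprocessing.

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageClassification

repo_id = "nvidia/MambaVision-L3-512-21K"  # assumed repo id

model = AutoModelForImageClassification.from_pretrained(
    repo_id, trust_remote_code=True  # assumed: MambaVision ships custom code
).eval()

# Assumed preprocessing: 512x512 resize with ImageNet normalization.
preprocess = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
image = Image.open("example.jpg").convert("RGB")  # any local RGB image
pixel_values = preprocess(image).unsqueeze(0)     # add batch dimension

with torch.no_grad():
    logits = model(pixel_values).logits
print("predicted class index:", logits.argmax(-1).item())
```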
## MambaVision-S-1K
License: Other · Publisher: nvidia · Downloads: 908 · Likes: 8
Tags: Image Classification · Transformers

The first hybrid computer-vision model to combine the advantages of Mamba and Transformer: it improves the efficiency of visual feature modeling by redesigning the Mamba formulation, and it improves long-range spatial dependency modeling by adding a self-attention module at the end of the Mamba architecture.
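Beyond classification, a backbone like this can also be used as a feature extractor. A sketch under assumptions: the repo id is inferred from the listing, the 224x224 input is a common ImageNet-1K convention rather than a confirmed value, and the output layout is whatever the checkpoint's custom code returns.

```python
import torch
from transformers import AutoModel

repo_id = "nvidia/MambaVision-S-1K"  # assumed repo id

backbone = AutoModel.from_pretrained(repo_id, trust_remote_code=True).eval()

# Hypothetical input resolution; check the model card for the real value.
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    outputs = backbone(dummy)

# The return layout (pooled features, per-stage feature maps, ...) is
# defined by the checkpoint's custom code; inspect it directly.
print(type(outputs))
```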